Abstract

β-lactam antibiotics are of clinical importance for the treatment of bacterial infections. Several
penicillins and cephalosporins with broad spectra of activity and high stability against
various β-lactamases have been developed and introduced into clinical practice. Owing to the
increasing prevalence of antibiotic resistance, efforts to synthesize new compounds with
better activity continue. Traditionally, a combination of serendipity and empiricism has
been the basis of new drug discovery. Trial-and-error synthesis of compounds and their
random screening for activity have proved to be both time-consuming and uneconomical.
Hence, predicting the pharmacokinetic parameters of a new molecule at an early stage of drug
design is as important as predicting the activity of the compound. With rapid advances in
computing power and the availability of experimental data, these ADME
properties can now be better predicted by using suitable computational methods. In the
present study, a quantitative structure-property relationship (QSPR) study relating
descriptors of the molecular structures of 32 cephalosporins to renal clearance was performed. Good correlations
of renal clearance were obtained with constitutional and electrostatic descriptors such as
bond length between C-O, bond length between H-O bonds, maximum bond length between
H-N bond, number of H-O bonds and charge of all C atoms. High values of R² (0.8397) and
Q² (0.7746) were indicative of the high predictive power of this correlation. Also, the lower R²
RAND value compared with R² indicates that the correlations obtained are not chance
correlations and hence can be used for prediction purposes.
1. Introduction


Infectious diseases are responsible for a
significant proportion of deaths
worldwide and, according to the World
Health Organization, antimicrobial agents
are considered to be miracle drugs that
are the leading weapons in the treatment
of infectious diseases. Unfortunately, a
number of the current clinically
efficacious antimicrobial agents are
becoming less effective because of the
development of microbial resistance.
There is therefore an urgent need for the
discovery or optimization of novel
antimicrobial agents that are active
against resistant microbial strains.

The pharmaceutical industry needs to
develop new drugs continuously in order
to fight the development of resistance in
pathogenic agents and to cope with newly
discovered types of infections [1]. Since
ADME (absorption, distribution,
metabolism and elimination) properties
are important parameters in lead
identification, in silico methods to
search for drug candidates with good
ADME properties have attracted the
interest of the pharmaceutical industry [2-4].


Various quantitative structure-activity/property
relationship (QSAR/QSPR) approaches have been
applied to find relationships between
ADME parameters and molecular
structure and properties. QSPRs, which find
mathematical relationships between the
physicochemical properties of compounds
and their experimentally determined
values, are among the most widely used
techniques in rational drug design. The
derived QSPR models can subsequently be
used to predict the pharmacokinetic
properties of new derivatives.

Traditionally, a combination of 
serendipity and empiricism has been the 
basis of new drug discovery. Trial and 
error synthesis of compounds and their 
random screening for activity have proved 
to be both time-consuming and 
uneconomical. Further, therapeutic effects 
and hazards to health are assessed using a 
series of experimental and in vivo tests. 
However, usage of animal models is often 
subject to ethical (and financial) 
considerations. Therefore, alternative
methods have been under development to
reduce the requirement of animals in
testing [5].

The structural formula of an organic 
compound, in principle, contains coded 
within it all the information which 
predetermines the chemical, biological, 
and physical properties of that compound. 
If we can understand how a molecular 
structure brings about a particular effect 
in a biological system, we have a key to 
unlocking the relationship and using that 
information to our advantage. Formal 
development of these relationships on this 
premise proved to be the foundation for 
the development of predictive models [6, 
7]. 

Quantitative structure-property
relationships (QSPRs) are mathematical
models that attempt to relate the
structure-derived features of a compound
to its biological or physicochemical
activity. Similarly, the terms quantitative
structure-toxicity relationship (QSTR) and
quantitative structure-pharmacokinetic
relationship (QSPR) are used when the
modeling is applied to toxicological or
pharmacokinetic systems. QSAR (like
QSPR and QSTR) works on the
assumption that structurally similar
compounds have similar activities.
Therefore, these methods have predictive
and diagnostic abilities. They can be used
to predict the biological activity (e.g., IC50)
or class (e.g., inhibitors versus
non-inhibitors) of compounds before the
actual biological testing. They can also be
used in the analysis of structural
characteristics that can give rise to the
properties of interest.

The explosive development of computer
technology and of methodologies to calculate
molecular properties has increasingly made it
possible to use computer techniques to
aid the drug discovery process. The use of
computer techniques in this context is
often called computer-aided drug design
(CADD), but since the development of a
drug involves a large number of steps in
addition to the development of a
high-affinity ligand, a more appropriate name,
computer-aided ligand design (CALD), has
also been proposed [8].


2. Materials and Methods 


The present study was undertaken with the
objective of establishing quantitative
structure-pharmacokinetic relationships (QSPR) of
prognostic relevance in the β-lactam series
of drugs, specifically cephalosporins.
Cephalosporins were selected because such
correlations have been developed for
very few drugs. Further, the few reports on
QSPR available for this series of drugs
involved only small sets of
drugs and few descriptors. Thus, an attempt
was made to evaluate quantitative
relationships between structural
descriptors of cephalosporin molecules and
renal clearance.

The work was divided into three phases: 


1. Computation of molecular descriptors 
2. Compilation of pharmacokinetic data 
3. Development of meaningful correlations 


Computation of molecular descriptors 

It is a well-known fact that the structure of a
drug molecule can be expressed quantitatively
in terms of its physicochemical descriptors,
which are lipophilic, electronic and steric in
nature. These physicochemical descriptors
govern the biological activity of the
compound.

Thirty-two cephalosporins for which experimental
renal clearance values are available were
selected for the study. The PubChem database
contains 2D and 3D minimized structures of a
large number of drugs and other molecules.
The 3D structures of the selected cephalosporins
(Table 1) were downloaded from the database and
used as such for the correlation studies.
The structures of the cephalosporins in molfile
format were used as input for the computation
of descriptors.


Table 1. 3D structures of the selected cephalosporins used in the study

[3D structure images of the 32 selected cephalosporins, from Cefaclor (S. No. 1) to Cephradine (S. No. 32), are not reproduced here.]

A representative molfile for one of the cephalosporins used in the study, Cefaclor, is given
below:


CEFACLOR 51039
  -OEChem-01232605283D

 38 40  0     1  0  0  0  0  0999 V2000
    4.3759   -2.8607    0.1803 Cl  0  0  0  0  0  0  0  0  0  0  0  0
    0.4211   -1.2636    1.0217 S   0  0  0  0  0  0  0  0  0  0  0  0
    2.3110    2.5640   -0.8543 O   0  0  0  0  0  0  0  0  0  0  0  0
   -2.1507    2.0851    1.5566 O   0  0  0  0  0  0  0  0  0  0  0  0
    5.3604    0.5101    0.0076 O   0  0  0  0  0  0  0  0  0  0  0  0
    4.3871    0.0327   -2.0061 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.2714    0.5146    0.3114 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.6351    1.4084   -0.0623 N   0  0  0  0  0  0  0  0  0  0  0  0
   -4.0767    2.3920   -0.3999 N   0  0  0  0  0  0  0  0  0  0  0  0
    1.2515    0.3141     1.330 C   0  0  1  0  0  0  0  0  0  0  0  0
    0.5689    1.5738    0.7510 C   0  0  1  0  0  0  0  0  0  0  0  0
    1.8682    1.7394   -0.0895 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.2440   -0.3919    0.0343 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.9540   -2.2411    1.2193 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.1419   -1.6662    0.4708 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.9169    1.6830    0.4179 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.0082    1.4114   -0.6118 C   0  0  1  0  0  0  0  0  0  0  0  0
    4.3725    0.0641   -0.7926 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5054   -0.0155   -0.5080 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.8881   -1.0009   -1.2576 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.5653   -0.3008    0.3343 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.3458   -2.3150   -1.1619 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.0228   -1.6150    0.4300 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.4130   -2.6221   -0.3182 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.6341    0.4445    2.3464 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.4597    2.3811     1.489 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.5495    1.0672    -1.068 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.7232   -3.2486    0.8573 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.1915   -2.3136    2.2862 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.6100    1.5863   -1.6194 H   0  0  0  0  0  0  0  0  0  0  0  0
   -4.7869    2.2823  -1.02325 H   0  0  0  0  0  0  0  0  0  0  0  0
   -3.7028    3.3332   -0.5191 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.0565   -0.7725   -1.9177 H   0  0  0  0  0  0  0  0  0  0  0  0
   -5.0487    0.4568    0.9434 H   0  0  0  0  0  0  0  0  0  0  0  0
    6.1360    0.8252   -0.5037 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.8705   -3.0993   -1.7438 H   0  0  0  0  0  0  0  0  0  0  0  0
   -5.8510   -1.8554   1.00080 H   0  0  0  0  0  0  0  0  0  0  0  0
   -4.7684   -3.6456   -0.2225 H   0  0  0  0  0  0  0  0  0  0  0  0
  1 15  1  0  0  0  0
  2 10  1  0  0  0  0
  2 14  1  0  0  0  0
  3 12  2  0  0  0  0
  4 16  2  0  0  0  0
  5 18  1  0  0  0  0
  5 35  1  0  0  0  0
  6 18  2  0  0  0  0
  7 10  1  0  0  0  0
  7 12  1  0  0  0  0
  7 13  1  0  0  0  0
  8 11  1  0  0  0  0
  8 16  1  0  0  0  0
  8 27  1  0  0  0  0
  9 17  1  0  0  0  0
  9 31  1  0  0  0  0
  9 32  1  0  0  0  0
 10 11  1  0  0  0  0
 10 25  1  0  0  0  0
 11 12  1  0  0  0  0
 11 26  1  0  0  0  0
 13 15  2  0  0  0  0
 13 18  1  0  0  0  0
 14 15  1  0  0  0  0
 14 28  1  0  0  0  0
 14 29  1  0  0  0  0
 16 17  1  0  0  0  0
 17 19  1  0  0  0  0
 17 30  1  0  0  0  0
 19 20  2  0  0  0  0
 19 21  1  0  0  0  0
 20 22  1  0  0  0  0
 20 33  1  0  0  0  0
 21 23  2  0  0  0  0
 21 34  1  0  0  0  0
 22 24  2  0  0  0  0
 22 36  1  0  0  0  0
 23 24  1  0  0  0  0
 23 37  1  0  0  0  0
 24 38  1  0  0  0  0
M  END
$$$$
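Two of the descriptor types named in this study, C-O bond lengths and the number of H-O bonds, can be read almost directly from the atom and bond blocks of a V2000 molfile such as the one above. The following is a minimal sketch of that idea, not the CODESSA implementation. The function names and the three-atom test record (with made-up coordinates) are assumptions for illustration, and the parser assumes whitespace-separated fields, which holds only while atom and bond counts stay below 100.

```python
import math


def parse_v2000(molblock):
    """Parse a V2000 molfile string into (atoms, bonds).

    atoms: list of (symbol, (x, y, z)); bonds: list of (i, j, order)
    with 0-based atom indices. Assumes whitespace-separated fields.
    """
    lines = molblock.splitlines()
    counts = lines[3].split()  # the fourth line is the counts line
    n_atoms, n_bonds = int(counts[0]), int(counts[1])
    atoms = []
    for line in lines[4:4 + n_atoms]:
        x, y, z, symbol = line.split()[:4]
        atoms.append((symbol, (float(x), float(y), float(z))))
    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        i, j, order = (int(v) for v in line.split()[:3])
        bonds.append((i - 1, j - 1, order))
    return atoms, bonds


def bond_lengths(atoms, bonds, pair):
    """Lengths of all bonds joining the two element symbols in `pair`."""
    wanted = sorted(pair)
    lengths = []
    for i, j, _order in bonds:
        if sorted((atoms[i][0], atoms[j][0])) == wanted:
            lengths.append(math.dist(atoms[i][1], atoms[j][1]))
    return lengths


# Hypothetical C-O-H fragment with invented coordinates (not study data).
EXAMPLE = "\n".join([
    "example fragment",
    "  hypothetical",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.4000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.4000    0.9600    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0  0  0  0",
    "  2  3  1  0  0  0  0",
    "M  END",
])

atoms, bonds = parse_v2000(EXAMPLE)
co_lengths = bond_lengths(atoms, bonds, ("C", "O"))
average_co = sum(co_lengths) / len(co_lengths)       # average C-O bond length
n_ho_bonds = len(bond_lengths(atoms, bonds, ("H", "O")))  # number of H-O bonds
```

A strict reader would slice the fixed-width columns defined by the molfile format instead of splitting on whitespace; the simpler form is used here only to keep the sketch short.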


The software packages used to calculate the descriptors
were QikProp and CODESSA
(Comprehensive Descriptors for Structural
and Statistical Analysis). One of the major
advantages of CODESSA is its large pool of
molecular descriptors, which are calculated
for each chemical structure. Descriptors are
automatically calculated for all structures
added to the storage. These programs were
obtained from Schrödinger and M/s Semichem,
Kansas, USA, respectively. These packages
were selected on the basis of literature
reports in which they were shown to possess most
of the attributes desired in the development
of quantitative structure-property
relationships.


Steps in QikProp software to calculate
descriptors included:

• MOL files were used as input to the
software by selecting the command
Project → Import structures. All the
molfiles were selected and imported
into QikProp. Descriptors were
calculated by using the command
Application → QikProp and pressing Run.
QikProp calculates all the descriptors
and creates a project table.

To develop better correlations,
additional descriptors were calculated using
CODESSA. The descriptors obtained in QikProp
were saved as a CSV file for integration into
CODESSA, which can read input
directly from a CSV or text file.
To calculate the CODESSA descriptors, the command
Descriptors → Calculate was used.

CODESSA calculated additional descriptors
for each cephalosporin.


Compilation of pharmacokinetic (Renal 
clearance) data 

For comparing the predicted values of renal
clearance with actual values, reported
values of renal clearance of cephalosporins
in humans were taken from the literature [9-13].
Where different authors reported
different values, all values were taken and the
mean value for each cephalosporin was
calculated. The compiled values of renal
clearance for all 32 cephalosporins used in the
study are given in Table 2.
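The averaging step described above can be sketched as follows. This is only an illustration of taking the mean of all reported values per drug; the function name, the rounding to two decimals and the numbers are assumptions, not the literature data behind Table 2.

```python
from statistics import mean


def compile_clearance(reports):
    """Average all literature values reported for each drug.

    `reports` maps a drug name to the list of renal clearance values
    found in different sources; drugs with no values are skipped.
    """
    return {drug: round(mean(values), 2)
            for drug, values in reports.items() if values}


# Hypothetical numbers for illustration only.
reports = {
    "Drug A": [120.0, 135.0, 130.0],
    "Drug B": [20.0, 24.0],
    "Drug C": [],
}
compiled = compile_clearance(reports)
```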


Table 2. Renal clearance values of selected cephalosporins

#    Cephalosporin        CLr (ml/min)   #    Cephalosporin    CLr (ml/min)
1.   Cefaclor             289.5          2.   Cefotiam         200
3.   Cefadroxil           128.35         4.   Cefoxitin        285.37
5.   Cefamandole          162            6.   Cefpimizole      94.5
7.   Cefamandole nafate   225            8.   Cefpirome        82.1
9.   Cefatrizine          175            10.  Cefprozil        171.5
11.  Cefazolin            52.5           12.  Cefroxadine      291.5
13.  Cefetamet            130.3          14.  Cefsulodin       85
15.  Cefixime             21.8           16.  Ceftazidime      90.87
17.  Cefmenoxime          176            18.  Ceftibuten       62.1
19.  Cefonicid            22.26          20.  Ceftizoxime      107.33
21.  Cefoperazone         17.86          22.  Ceftriaxone      7.87
23.  Ceforanide           4.15           24.  Cefuroxime       125
25.  Cefotaxime           160.5          26.  Cephacetrile     313
27.  Cefotetan            26.5           28.  Cephalexin       195
29.  Cephalothin          252            30.  Cephaloridine    130
31.  Cephapirin           340            32.  Cephradine       343
Development of meaningful correlations

One of several problems in the design of QSPR
models is the selection of the most relevant
set of molecular descriptors for the
property or activity that is to be
modeled. Chemical structures are usually
encoded by a variety of descriptor families
such as functional groups, topological,
constitutional, thermodynamic, quantum
mechanical, etc. Descriptor selection is the
process of identifying the most relevant,
information-rich descriptors from the large set
of available descriptors. Not all of the descriptors
generated for each molecule are
significant in developing QSPR models. The
use of all available descriptors in the model
development process causes poor
predictions because of overfitting. Only
significant descriptors calculated by
QikProp and CODESSA were used in the
correlation studies. Insignificant or
intercorrelated descriptors were skipped.
Correlation studies were carried out with the
"Best Multilinear Regression" subroutine in
CODESSA.

Selection criteria and steps used for "Best
Multilinear Regression" in CODESSA were:

• The maximum number of descriptors was
started from 1 and then increased up to a
limit depending on the number of molecules
selected. The drug molecule : descriptor
ratio was taken as 6:1, which implies
that not more than one descriptor per 6
molecules in a series was used for
developing correlations. For example, if
there were 21 molecules for a particular
property, the maximum number of
descriptors used for developing
regression equations was kept at 3.
Similarly, for a series having 40
molecules, the maximum number of
descriptors was 6.

• The maximum number of correlations per
number of descriptors was kept as 5.

• The correlation improvement cut-off was
kept as 0.01.

• The maximum r² for orthogonal descriptors
was kept as 0.5.

• If a property value was missing, the
option to skip the structure was selected.

The "Best Multilinear Regression" routine tests a
large number of correlations, as each
descriptor type is analyzed individually
for correlations with the selected
pharmacokinetic property.
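The 6:1 rule above amounts to an integer division of the series size by six. A minimal sketch (the function name is an assumption; this is not a CODESSA setting or API):

```python
def max_descriptors(n_molecules, ratio=6):
    """Largest descriptor count allowed by a molecules:descriptor ratio of ratio:1."""
    return max(1, n_molecules // ratio)


# Series sizes mentioned in the text: 21 and 40 molecules, plus the
# 32 cephalosporins of this study.
limits = {n: max_descriptors(n) for n in (21, 32, 40)}
```

For the two worked examples in the text this gives 3 and 6 descriptors, and for a series of 32 molecules it gives 5.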


3. Results and discussion 


Renal clearance data were available for 32
cephalosporins; thus, correlations were
attempted keeping the maximum number of
descriptors at 5, thereby limiting the drug :
descriptor ratio to 6:1. Leave-one-out (LOO)
and y-scramble tests were also performed. The
best correlations obtained with renal
clearance (CLr) for cephalosporins are
given in Table 3 below. The table lists
equations starting from a one-descriptor
equation up to an equation with the maximum
number of descriptors (i.e. 5) that can be
used, as mentioned above.

Because a large number of such correlations
could be reported for each
property, the correlations are presented
in equation form. The validity of an
equation and the relative importance of the
different parameters used can be judged by
four statistical criteria, namely the coefficient of
determination (R²), the cross-validated R² (Q²),
Fisher's F value, and R² RAND, which is the
maximum R² obtained after randomizing
the property values and finding correlations
with the descriptors again. A larger value of F
indicates a higher probability of the QSPR
equation being significant. These methods
provide the correlation coefficient (r), the standard
deviation (s), and the ratio between the variances of
the calculated and observed activities (F).
Depending upon the values of these
statistical parameters, the significance of
each equation was evaluated.
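The three validation statistics named above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the CODESSA procedure: the helper names are invented, Q² is computed as 1 - PRESS/SS_total from leave-one-out predictions, and the y-scramble step refits a fixed descriptor set to shuffled property values (the text describes searching for correlations again, which this simplification does not do). The data are synthetic.

```python
import random


def fit_mlr(X, y):
    """Least-squares fit via the normal equations; returns [b0, b1, ...]."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(k):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][k] for i in range(k)]


def predict(coef, x):
    return coef[0] + sum(b * v for b, v in zip(coef[1:], x))


def r_squared(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def q_squared_loo(X, y):
    """Leave-one-out cross-validated R^2 (1 - PRESS / SS_total)."""
    preds = []
    for i in range(len(y)):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(coef, X[i]))
    return r_squared(y, preds)


def r2_rand(X, y, n_trials=100, seed=0):
    """Maximum R^2 found after shuffling y (descriptor set kept fixed)."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(n_trials):
        y_perm = list(y)
        rng.shuffle(y_perm)
        coef = fit_mlr(X, y_perm)
        best = max(best, r_squared(y_perm, [predict(coef, x) for x in X]))
    return best


# Synthetic one-descriptor data, for illustration only.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
coef = fit_mlr(X, y)
r2 = r_squared(y, [predict(coef, x) for x in X])
q2 = q_squared_loo(X, y)
rand = r2_rand(X, y)
```

A model is usually considered free of chance correlation when R² clearly exceeds the best R² obtained on scrambled data, which is the comparison the text makes between R² and R² RAND.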
Table 3. Correlations of renal clearance in the series of cephalosporins

No. | M | N  | R²     | Q²     | F-value | R² RAND
1   | 1 | 32 | 0.4689 | 0.4089 | 26.4875 | 0.4021
2   | 2 | 32 | 0.6054 | 0.5289 | 22.2454 | 0.4762
3   | 3 | 32 | 0.7476 | 0.6802 | 27.6409 | 0.5456
4   | 4 | 32 | 0.7899 | 0.7128 | 25.3760 | 0.4818
5   | 5 | 32 | 0.8397 | 0.7746 | 27.2366 | 0.5690

Equations:

1. CLr = -333.858*Average Information Content (Order 1) + 1570.007

2. CLr = 1898.612*Average Bond Length for a C-O Bond
   - 382.605*Average Information Content (Order 1) - 642.784

3. CLr = 3405.017*Average Bond Length for a C-O Bond
   - 647.698*Net Zefirov Charge of All C Atoms
   - 197.354*Uniform-Mass, Center of Mass, X - 3904.678

4. CLr = 3350.916*Average Bond Length for a C-O Bond
   - 618.266*Net Zefirov Charge of All C Atoms
   - 177.44*Uniform-Mass, Center of Mass, X
   - 33.616*Number of H-O Bonds - 3804.09

5. CLr = 3208.327*Average Bond Length for a C-O Bond
   - 590.784*Net Zefirov Charge of All C Atoms
   - 205.246*Uniform-Mass, Center of Mass, X
   + 43161.077*Maximum Bond Length for a H-N Bond
   - 31.141*Fractional Minimum Zefirov Negative Charge Times ASASA
   - 47542.826

M = Number of molecular descriptors, N = Number of cephalosporins
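As a consistency check on Table 3, the F values can be recomputed from R², M and N with the usual F statistic for a multilinear regression, F = (R²/M) / ((1 - R²)/(N - M - 1)). That CODESSA uses exactly this expression is an assumption; the sketch below only shows that it reproduces the printed values to within the rounding of R².

```python
def f_value(r2, n, m):
    """F statistic of a linear model with m descriptors fitted to n compounds."""
    return (r2 / m) / ((1.0 - r2) / (n - m - 1))


# (M, R^2, reported F) as printed in Table 3; N = 32 for every equation.
TABLE3 = [
    (1, 0.4689, 26.4875),
    (2, 0.6054, 22.2454),
    (3, 0.7476, 27.6409),
    (4, 0.7899, 25.3760),
    (5, 0.8397, 27.2366),
]

differences = [abs(f_value(r2, 32, m) - reported) for m, r2, reported in TABLE3]
```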


Good correlations of renal clearance were
obtained with constitutional and
electrostatic descriptors. The descriptors
that figured in the best correlation
(Equation 5, Table 3) were bond length
between C-O, bond length between H-O
bonds, maximum bond length between H-N
bond, number of H-O bonds and charge of
all C atoms. The high values of R² (0.8397) and
Q² (0.7746) obtained with Equation 5 are
indicative of the high predictive power of this
correlation equation. It is notable that the
R² RAND value is lower than R², which
indicates that the correlation equation
obtained is not a chance correlation and
hence can be used for prediction purposes.

As it would be too voluminous to give
details of each of the equations obtained,
details of only the best correlation are given.
The correlation matrix of the descriptors used
in Equation 5 is given in Table 4.
Table 4. Correlation matrix for selected descriptors in Equation 5, Table 3

Descriptor                                                    (1)      (2)      (3)      (4)      (5)
(1) Average Bond Length for a C-O Bond                       1.0000
(2) Net Zefirov Charge of All C Atoms                        0.4030   1.0000
(3) Uniform-Mass, Center of Mass, X                          0.2481  -0.1699   1.0000
(4) Maximum Bond Length for a H-N Bond                      -0.3289  -0.6175   0.2802   1.0000
(5) Fractional Minimum Zefirov Negative Charge Times ASASA  -0.3428  -0.4416   0.1628   0.7545   1.0000


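The correlation matrix indicates that none of
the descriptors used in the correlation is
strictly orthogonal to the other descriptors;
the largest pairwise correlation coefficient
(0.7545) is between Maximum Bond Length for a
H-N Bond and Fractional Minimum Zefirov
Negative Charge Times ASASA.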
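The pairwise values in Table 4 are correlation coefficients (r), whereas the regression setting quoted in the Methods is expressed as r². A short sketch for converting the matrix and listing the pairs above a chosen r² level is given below. The function name is an assumption, and reading the 0.5 setting as a limit on pairwise r² is also an assumption rather than documented CODESSA behaviour.

```python
# Lower triangle of Table 4 (pairwise correlation coefficients r),
# in the row order of the table.
LOWER = [
    [1.0000],
    [0.4030, 1.0000],
    [0.2481, -0.1699, 1.0000],
    [-0.3289, -0.6175, 0.2802, 1.0000],
    [-0.3428, -0.4416, 0.1628, 0.7545, 1.0000],
]


def pairs_above(lower, r2_cutoff):
    """Return (row, column, r^2) for off-diagonal pairs with r^2 above the cutoff."""
    hits = []
    for i, row in enumerate(lower):
        for j, r in enumerate(row[:i]):  # row[:i] excludes the diagonal
            if r * r > r2_cutoff:
                hits.append((i, j, round(r * r, 4)))
    return hits


flagged = pairs_above(LOWER, 0.5)
```

With these numbers only the pair formed by the fourth and fifth descriptors (r = 0.7545) has an r² above 0.5; every other pair is below 0.4.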

The MLR regression coefficients for 
individual descriptors used in best fit 
Equation 5 are given in Table 5. 

The plots of experimental versus predicted 
renal clearance values obtained are given in 
Figure 1. 


With all the correlations highly significant,
and the Q² values reasonably high (> 0.5 for
Equations 2 to 5), some excellent relationships are
achieved which can be successfully used to
assess the renal clearance of newer
molecules.


Table 5. MLR regression coefficients and t-values for CLr in cephalosporins

# | Descriptor name                                        | Coeff.      | t       | p(t)     | SE
0 | Intercept                                              | -47542.8256 | -3.3559 | 0.002442 | 14167.1347
1 | Average Bond Length for a C-O Bond                     | 3208.3268   | 6.8197  | 3.08E-07 | 470.4483
2 | Net Zefirov Charge of All C Atoms                      | -590.7837   | -7.7480 | 3.21E-08 | 76.2496
3 | Uniform-Mass, Center of Mass, X                        | -205.2464   | -7.0943 | 1.56E-07 | 28.9313
4 | Maximum Bond Length for a H-N Bond                     | 43161.0773  | 3.1112  | 0.004485 | 13872.5941
5 | Fractional Minimum Zefirov Negative Charge Times ASASA | -31.1406    | -3.7809 | 0.000826 | 8.2364
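The t-values in Table 5 are the coefficients divided by their standard errors, and the coefficients themselves define Equation 5. The sketch below recomputes t from the tabulated numbers and wraps Equation 5 in a small function. The names `t_value` and `predict_clr` are assumptions for illustration; descriptor values passed to `predict_clr` would have to come from CODESSA for the molecule of interest, and none are given in this paper.

```python
# name: (coefficient, standard error, reported t), copied from Table 5.
TABLE5 = {
    "Intercept": (-47542.8256, 14167.1347, -3.3559),
    "Average Bond Length for a C-O Bond": (3208.3268, 470.4483, 6.8197),
    "Net Zefirov Charge of All C Atoms": (-590.7837, 76.2496, -7.7480),
    "Uniform-Mass, Center of Mass, X": (-205.2464, 28.9313, -7.0943),
    "Maximum Bond Length for a H-N Bond": (43161.0773, 13872.5941, 3.1112),
    "Fractional Minimum Zefirov Negative Charge Times ASASA":
        (-31.1406, 8.2364, -3.7809),
}


def t_value(coefficient, standard_error):
    """t statistic of a regression coefficient."""
    return coefficient / standard_error


def predict_clr(descriptors):
    """Evaluate Equation 5 for descriptor values keyed by the Table 5 names."""
    total = TABLE5["Intercept"][0]
    for name, value in descriptors.items():
        total += TABLE5[name][0] * value
    return total


# Largest gap between recomputed and reported t-values.
max_t_gap = max(abs(t_value(c, se) - t) for c, se, t in TABLE5.values())
```

The recomputed t-values agree with the tabulated ones to the printed precision, which supports the reading of the columns as coefficient, t, p(t) and standard error.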
Figure 1. Plot of experimental vs. predicted
renal clearance (CLr)


Conclusion 


Good correlations of renal clearance were
obtained with constitutional and
electrostatic descriptors. The descriptors
that figured in the best correlation were
bond length between C-O, bond length
between H-O bonds, maximum bond length
between H-N bond, number of H-O bonds
and charge of all C atoms. The high values of R²
(0.8397) and Q² (0.7746) obtained with 5
descriptors are indicative of the high predictive
power of this correlation. Also, R² RAND
values lower than R² show that these are
not chance correlations.